Combining lexical, syntactic and prosodic cues for improved online dialog act tagging
Abstract
Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that represent the prosodic contour coarsely through summary statistics. The proposed scheme for exploiting prosody results in an absolute improvement of 8.7% over the use of most other widely used representations of acoustic correlates of prosody. The proposed scheme is discriminative and exploits context in the form of lexical, syntactic and prosodic cues from preceding discourse segments. Such a decoding scheme facilitates online DA tagging and offers robustness in the decoding process, unlike greedy decoding schemes that can potentially propagate errors. Our approach is different from traditional DA systems that use the entire conversation for offline dialog act decoding with the aid of a discourse model. In contrast, we use only static features and approximate the previous dialog act tags in terms of lexical, syntactic and prosodic information extracted from previous utterances. Experiments on the Switchboard-DAMSL corpus, using only lexical, syntactic and prosodic cues from three previous utterances, yield a DA tagging accuracy of 72%, compared to 74% in the best-case scenario with accurate knowledge of previous DA tags (oracle). © 2009 Elsevier Ltd. All rights reserved.
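The idea of treating a prosodic contour as a symbol sequence rather than a handful of summary statistics can be illustrated with a minimal sketch. The abstract does not specify how the acoustic-prosodic values are quantized or which n-gram orders are used, so the uniform binning, the bin count, the value range, and the function names `quantize` and `ngram_features` below are all assumptions made for illustration, not the authors' method. The resulting counts are the kind of sparse features that a maximum entropy classifier could consume; the classifier itself is not shown.

```python
from collections import Counter


def quantize(contour, lo, hi, n_bins):
    """Map each contour value to an integer bin index in [0, n_bins - 1].

    Uniform binning over [lo, hi] is an assumption for this sketch.
    """
    width = (hi - lo) / n_bins
    bins = []
    for value in contour:
        clipped = min(max(value, lo), hi)      # keep values inside the range
        index = int((clipped - lo) / width)
        bins.append(min(index, n_bins - 1))    # the top edge falls in the last bin
    return bins


def ngram_features(symbols, max_n=3, prefix="f0"):
    """Count all n-grams of order 1..max_n over the quantized sequence."""
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(symbols) - n + 1):
            gram = "_".join(str(s) for s in symbols[i:i + n])
            feats[f"{prefix}:{n}:{gram}"] += 1
    return feats


# Hypothetical pitch values (Hz) for one utterance; not data from the paper.
contour = [110.0, 120.0, 150.0, 190.0, 210.0]
symbols = quantize(contour, lo=100.0, hi=220.0, n_bins=4)
features = ngram_features(symbols, max_n=2)
```

Unlike a mean or a slope, the n-gram counts retain local ordering in the contour, which is the property the abstract contrasts with summary statistics.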
Similar resources
Exploiting prosodic features for dialog act tagging in a discriminative modeling framework
Cue-based automatic dialog act tagging uses lexical, syntactic and prosodic knowledge in the identification of dialog acts. In this paper, we propose a discriminative framework for automatic dialog act tagging using maximum entropy modeling. We propose two schemes for integrating prosody in our modeling framework: (i) Syntax-based categorical prosody prediction from an automatic prosody labeler,...
Lexical, Prosodic, And Syntactic Cues For Dialog Acts
The structure of a discourse is reflected in many aspects of its linguistic realization, including its lexical, prosodic, syntactic, and semantic nature. Multiparty dialog contains a particular kind of discourse structure, the dialog act (DA). Like other types of structure, the dialog act sequence of a conversation is also reflected in its lexical, prosodic, and syntactic realization. This pape...
Production of English Lexical Stress by Persian EFL Learners
This study examines the phonetic properties of lexical stress in English produced by Persian speakers learning English as a foreign language. The four most reliable phonetic correlates of English lexical stress, namely fundamental frequency, duration, intensity, and vowel quality were measured across Persian speakers’ production of the stressed and unstressed syllables of five English disyllabi...
Dialog Act Modeling for Automatic Tagging and Recognition of Conversational Speech
We describe a statistical approach for modeling dialog acts in conversational speech, i.e., speech-act-like units such as Statement, Question, Backchannel, Agreement, Disagreement, and Apology. Our model detects and predicts dialog acts based on lexical, collocational, and prosodic cues, as well as on the discourse coherence of the dialog act sequence. The dialog model is based on treating the d...
Reliability of Lexical and Prosodic Cues in Two Real-life Spoken Dialog Corpora
The present research focuses on analyzing and detecting emotions in speech as revealed by task-dependent spoken dialog corpora. Previously, we have conducted several experiments on a real-life corpus in order to develop a reliable annotation method and to detect lexical and prosodic cues correlated to the main emotion class. In this paper we evaluate both the robustness of the annotation schem...
Journal title:
- Computer Speech & Language
Volume 23, Issue
Pages -
Publication date: 2009